Live Speaker Identification in Meetings – “Who Is Speaking Now?”
Abstract
This paper presents an application that fuses the currently artificially separated tasks of speaker identification and speaker diarization. The presented method allows online identification of who is currently speaking, using a single far-field microphone in a meeting scenario. It can recognize the current speaker after any two seconds of speech. An evaluation of the algorithm's robustness on the AMI Meeting Corpus yielded a Diarization Error Rate of 12.67%.
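For readers unfamiliar with the metric, the Diarization Error Rate is conventionally the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The following is a minimal frame-level sketch of that definition, not the scoring procedure used in the paper. It assumes one label per fixed-length frame, at most one active speaker per frame, and hypothesis labels that have already been mapped to reference speaker names; the function name `frame_der` and the toy labels are hypothetical.

```python
from typing import Optional, Sequence


def frame_der(
    reference: Sequence[Optional[str]],
    hypothesis: Sequence[Optional[str]],
) -> float:
    """Frame-level Diarization Error Rate.

    Each element is the speaker label for one frame, or None for non-speech.
    Assumes no overlapping speech and that hypothesis labels are already
    mapped onto reference speaker names (no optimal mapping is computed).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    missed = false_alarm = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            speech += 1
            if hyp is None:
                missed += 1        # speech in reference, silence in hypothesis
            elif hyp != ref:
                confusion += 1     # speech attributed to the wrong speaker
        elif hyp is not None:
            false_alarm += 1       # silence in reference, speech in hypothesis

    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / speech


# Toy example: 8 frames, 6 of them speech in the reference.
ref = ["A", "A", "B", "B", None, None, "A", "A"]
hyp = ["A", "A", "B", "A", None, "B", "A", None]
print(frame_der(ref, hyp))  # one miss, one false alarm, one confusion over 6 speech frames
```

Standard scoring tools additionally handle overlapping speech, a forgiveness collar around segment boundaries, and an optimal speaker mapping, all of which this sketch omits.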
Similar resources
Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation
Our goal is to create speaker models in the audio domain and face models in the video domain from a set of videos in an unsupervised manner. Such models can later be used for speaker identification in the audio domain (answering the question “Who was speaking and when?”) and/or for face recognition (“Who was seen and when?”) for given videos that contain speaking persons. The proposed system is based on an a...
Speech/Non-Speech Detection in Meetings from Automatically Extracted Low Resolution Visual Features
In this paper we address the problem of estimating who is speaking from automatically extracted low resolution visual cues in group meetings. Traditionally, the task of speech/non-speech detection or speaker diarization tries to find “who speaks and when” from audio features only. In this paper, we investigate more systematically how speaking status can be estimated from low resolution video. We e...
A hybrid approach to online speaker diarization
This article presents a low-latency speaker diarization system (“who is speaking now?”) based on a hybrid approach that combines a traditional offline speaker diarization system (“who spoke when?”) with an online speaker identification system. The system fulfills all requirements of the diarization task, i.e. it does not need any a-priori information about the input, including no specific speak...
The influence of vocal effort on human speaker identification
Although many of the acoustic cues used for speaker identification change systematically with the voice level of the talker, little is known about the influence that vocal effort has on the identification of individual talkers by human listeners. In this experiment, listeners were trained to identify four different same-sex talkers speaking at one of three different levels of vocal effort (whis...
Ūkio Technologinis ir Ekonominis Vystymas (Technological and Economic Development of Economy)
The problem of speaker identification is investigated. Basic segments (pseudo-stationary intervals of voiced sounds) are used for identification. The identification is carried out by comparing average distances between the investigated speaker and the comparison speakers. The linear prediction coefficients (LPC) of a vocal tract model are used as identification features. Such a problem arises in stenographi...